NSF PAR Search | NSF Public Access Repository

Lessons and Insights from a Unifying Study of Parameter-Efficient Fine-Tuning (PEFT) in Visual Recognition

Mai, Zheda; Zhang, Ping; Tu, Cheng-Hao; Chen, Hong-You; Zhang, Li; Chao, Wei-Lun (June 2025, IEEE)

Free, publicly-accessible full text available June 15, 2026
CollabLLM: From Passive Responders to Active Collaborators

Wu, Shirley; Galley, Michel; Peng, Baolin; Cheng, Hao; Li, Gavin; Dou, Yao; Cai, Weixin; Zou, James; Leskovec, Jure; Gao, Jianfeng (July 2025, International Conference on Machine Learning)

Large Language Models are typically trained with next-turn rewards, limiting their ability to optimize for long-term interaction. As a result, they often respond passively to ambiguous or open-ended user requests, failing to help users reach their ultimate intents and leading to inefficient conversations. To address these limitations, we introduce COLLABLLM, a novel and general training framework that enhances multiturn human-LLM collaboration. Its key innovation is a collaborative simulation that estimates the long-term contribution of responses using Multiturn-aware Rewards. By reinforcement fine-tuning these rewards, COLLABLLM goes beyond responding to user requests and actively uncovers user intent and offers insightful suggestions, a key step towards more human-centered AI. We also devise a multiturn interaction benchmark with three challenging tasks such as document creation. COLLABLLM significantly outperforms our baselines, with averages of 18.5% higher task performance and 46.3% improved interactivity as rated by LLM judges. Finally, we conduct a large user study with 201 judges, where COLLABLLM increases user satisfaction by 17.6% and reduces the time users spend by 10.4%.
Free, publicly-accessible full text available July 13, 2026
An invitation to the sample complexity of quantum hypothesis testing

https://doi.org/10.1038/s41534-025-00980-8

Cheng, Hao-Chung; Datta, Nilanjana; Liu, Nana; Nuradha, Theshani; Salzmann, Robert; Wilde, Mark M (June 2025, npj Quantum Information)

We study the sample complexity of quantum hypothesis testing, wherein the goal is to determine the minimum number of samples needed to reach a desired error probability. We characterize the sample complexity of binary quantum hypothesis testing in the symmetric and asymmetric settings, and we provide bounds on the sample complexity of multiple quantum hypothesis testing. The final part of our paper outlines and reviews how the sample complexity of quantum hypothesis testing is relevant to a broad swathe of research areas and can enhance understanding of many fundamental concepts, including quantum algorithms for simulation and search, quantum learning and classification, and foundations of quantum mechanics. As such, we view our paper as an invitation to researchers coming from different communities to study and contribute to the problem of sample complexity of quantum hypothesis testing, and we outline a number of open directions for future research.
Free, publicly-accessible full text available June 5, 2026
Feedrate optimization based on part-to-part learning in repeated machining

https://doi.org/10.1016/j.cirp.2025.04.043

Chou, Cheng-Hao; Shao, Chenhui; Okwudire, Chinedum E (January 2025, CIRP Annals)

Full Text Available
Colloidal Stability, Sedimentation, and Aggregation of Crystalline Two-Dimensional Crumpled Birnessite Flakes, Their Dye Adsorption and Immune Cell Response

https://doi.org/10.1021/acs.langmuir.4c03802

Hassig, Mary Qin; Walter, Adam D; Morris, Vanessa R; Zhu, Yucheng; Ibrahim, Ahmed M H; Gordon, Abijah; Ibrahim, Mohamed A; Cheng, Hao; Badr, Hussein O; Barsoum, Michel W (February 2025, Langmuir)
Fine-Tuning is Fine, if Calibrated

Mai, Zheda; Chowdhury, Arpita; Zhang, Ping; Tu, Cheng-Hao; Chen, Hong-You; Pahuja, Vardaan; Berger-Wolf, Tanya; Gao, Song; Stewart, Charles; Su, Yu; et al (December 2024, NeurIPS)

Full Text Available
Reviving the Context: Camera Trap Species Classification as Link Prediction on Multimodal Knowledge Graphs

Pahuja, Vardaan; Luo, Weidi; Gu, Yu; Tu, Cheng-Hao; Chen, Hong-You; Berger-Wolf, Tanya; Stewart, Charles; Gao, Song; Chao, Wei-Lun; Su, Yu (October 2024, ACM International Conference on Information and Knowledge Management (CIKM))

Full Text Available
Unmasking and Improving Data Credibility: A Study with Datasets for Training Harmless Language Models

Zhu, Zhaowei; Wang, Jialu; Cheng, Hao; Liu, Yang (May 2024, International Conference on Learning Representations)

Language models have shown promise in various tasks but can be affected by undesired data during training, fine-tuning, or alignment. For example, if some unsafe conversations are wrongly annotated as safe ones, the model fine-tuned on these samples may be harmful. Therefore, the correctness of annotations, i.e., the credibility of the dataset, is important. This study focuses on the credibility of real-world datasets that can be used for training a harmless language model, including the popular benchmarks Jigsaw Civil Comments, Anthropic Harmless & Red Team, and PKU BeaverTails & SafeRLHF. Given the cost and difficulty of having humans clean these datasets, we introduce a systematic framework for evaluating the credibility of datasets, identifying label errors, and evaluating the influence of noisy labels in the curated language data, specifically focusing on unsafe comments and conversation classification. With the framework, we find and fix an average of 6.16% label errors in 11 datasets constructed from the above benchmarks. The data credibility and downstream learning performance can be remarkably improved by directly fixing label errors, indicating the significance of cleaning existing real-world datasets.
Full Text Available
Reviving the Context: Camera Trap Species Classification as Link Prediction on Multimodal Knowledge Graphs

https://doi.org/10.1145/3627673.3679545

Pahuja, Vardaan; Luo, Weidi; Gu, Yu; Tu, Cheng-Hao; Chen, Hong-You; Berger-Wolf, Tanya; Stewart, Charles; Gao, Song; Chao, Wei-Lun; Su, Yu (October 2024, ACM)

Full Text Available
Fabrication of Biodegradable Poly(caprolactone) (PCL) and Poly(3-Hydroxybutyrate-Co-3-Hydroxyvalerate) (PHBV) Electrospun Composite Membrane for Oil-Water Separation

https://doi.org/10.4236/msce.2024.1212010

Liu, Yaohui; Lee, Cheng-Hao; Wang, Yanming; Kan, Chi-Wai (January 2024, Journal of Materials Science and Chemical Engineering)

Full Text Available
